|
The prisoner's dilemma is a standard example of a game analyzed in game theory that shows why two completely "rational" individuals might not cooperate, even if it appears that it is in their best interests to do so. It was originally framed by Merrill Flood and Melvin Dresher working at RAND in 1950. Albert W. Tucker formalized the game with prison sentence rewards and named it, "prisoner's dilemma" (Poundstone, 1992), presenting it as follows: :Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. The prosecutors lack sufficient evidence to convict the pair on the principal charge. They hope to get both sentenced to a year in prison on a lesser charge. Simultaneously, the prosecutors offer each prisoner a bargain. Each prisoner is given the opportunity either to: betray the other by testifying that the other committed the crime, or to cooperate with the other by remaining silent. The offer is: : * If A and B each betray the other, each of them serves 2 years in prison : * If A betrays B but B remains silent, A will be set free and B will serve 3 years in prison (and vice versa) : * If A and B both remain silent, both of them will only serve 1 year in prison (on the lesser charge) It is implied that the prisoners will have no opportunity to reward or punish their partner other than the prison sentences they get, and that their decision will not affect their reputation in the future. Because betraying a partner offers a greater reward than cooperating with him, all purely rational self-interested prisoners would betray the other, and so the only possible outcome for two purely rational prisoners is for them to betray each other. The interesting part of this result is that pursuing individual reward logically leads both of the prisoners to betray, when they would get a better reward if they both kept silent. In reality, humans display a systematic bias towards cooperative behavior in this and similar games, much more so than predicted by simple models of "rational" self-interested action. A model based on a different kind of rationality, where people forecast how the game would be played if they formed coalitions and then they maximize their forecasts, has been shown to make better predictions of the rate of cooperation in this and similar games given only the payoffs of the game. An extended "iterated" version of the game also exists, where the classic game is played repeatedly between the same prisoners, and consequently, both prisoners continuously have an opportunity to penalize the other for previous decisions. If the number of times the game will be played is known to the players, then (by backward induction) two classically rational players will betray each other repeatedly, for the same reasons as the single shot variant. In an infinite or unknown length game there is no fixed optimum strategy, and Prisoner's Dilemma tournaments have been held to compete and test algorithms. The prisoner's dilemma game can be used as a model for many real world situations involving cooperative behaviour. In casual usage, the label "prisoner's dilemma" may be applied to situations not strictly matching the formal criteria of the classic or iterative games: for instance, those in which two entities could gain important benefits from cooperating or suffer from the failure to do so, but find it merely difficult or expensive, not necessarily impossible, to coordinate their activities to achieve cooperation. ==Strategy for the prisoners' dilemma== Both cannot communicate, they are separated in two individual rooms. The normal game is shown below: Here, regardless of what the other decides, each prisoner gets a higher reward by betraying the other ("defecting"). The reasoning involves an argument by dilemma: B will either cooperate or defect. If B cooperates, A should defect, because going free is better than serving 1 year. If B defects, A should also defect, because serving 2 years is better than serving 3. So either way, A should defect. Parallel reasoning will show that B should defect. In traditional game theory, some very restrictive assumptions on prisoner behaviour are made. It is assumed that both understand the nature of the game, and that despite being members of the same gang, they have no loyalty to each other and will have no opportunity for retribution or reward outside the game. Most importantly, a very narrow interpretation of "rationality" is applied in defining the decision-making strategies of the prisoners. Given these conditions and the payoffs above, prisoner A will betray prisoner B. The game is symmetric, so Prisoner B should act the same way. Since both "rationally" decide to defect, each receives a lower reward than if both were to stay quiet. Traditional game theory results in both players being worse off than if each chose to lessen the sentence of his accomplice at the cost of spending more time in jail himself. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「Prisoner's dilemma」の詳細全文を読む スポンサード リンク
|